Recent enhancements in CU VOCAL for Chinese TTS-enabled applications

Authors

  • Helen M. Meng
  • Yuk-Chi Li
  • Tien Ying Fung
  • Man Cheuk Ho
  • Chi-Kin Keung
  • Tin Hang Lo
  • Wai Kit Lo
  • Pak-Chung Ching
Abstract

CU VOCAL is a Cantonese text-to-speech (TTS) engine. We use a syllable-based concatenative synthesis approach to generate intelligible and natural synthesized speech [1]. This paper describes several recent enhancements in CU VOCAL. First, we have augmented the syllable unit selection strategy with a positional feature. This feature specifies the relative location of a syllable in a sentence and serves to improve the quality of Cantonese tone realization. Second, we have developed the CU VOCAL SAPI engine, a version of the synthesizer that eases integration with applications using SAPI (Speech Application Programming Interface). We demonstrate the use of CU VOCAL SAPI in an electronic book (e-book) reader. Third, we have made an initial attempt to use the CU VOCAL SAPI engine in Web content authored with Speech Application Language Tags (SALT). The use of SALT tags can ease the task of invoking Cantonese TTS service on webpages.
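The abstract does not say how the positional feature enters unit selection, so the following is only a minimal sketch under stated assumptions: each recorded syllable unit is tagged with its relative location in its source sentence (0.0 for the start, 1.0 for the end), and the selector prefers the candidate whose location is closest to that of the target syllable. The names `Unit`, `relative_position`, `select_units`, the toy inventory, and the distance-based cost are all hypothetical and are not taken from CU VOCAL.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    syllable: str    # romanized syllable with tone number, e.g. "nei5"
    position: float  # assumed feature: relative location in the source sentence (0.0 to 1.0)
    unit_id: str     # hypothetical identifier of the recorded unit


def relative_position(index: int, length: int) -> float:
    """Map a syllable index to a relative location between 0.0 and 1.0."""
    return 0.0 if length <= 1 else index / (length - 1)


def select_units(syllables, inventory):
    """Pick, for each target syllable, the candidate whose recorded
    position is closest to the target's position (an assumed cost)."""
    chosen = []
    for i, syl in enumerate(syllables):
        target = relative_position(i, len(syllables))
        candidates = inventory[syl]
        best = min(candidates, key=lambda u: abs(u.position - target))
        chosen.append(best)
    return chosen


# Toy inventory: two recordings of each syllable, taken from different
# locations in their source sentences.
inventory = {
    "nei5": [Unit("nei5", 0.0, "nei5_a"), Unit("nei5", 1.0, "nei5_b")],
    "hou2": [Unit("hou2", 0.1, "hou2_a"), Unit("hou2", 0.9, "hou2_b")],
}

picked = select_units(["nei5", "hou2"], inventory)
print([u.unit_id for u in picked])  # ['nei5_a', 'hou2_b']
```

In a real concatenative synthesizer a positional term like this would presumably be combined with other selection criteria such as phonetic context; the sketch isolates the positional term only to show the idea.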


Similar resources

CU VOCAL Web Service: A Text-to-speech Synthesis Web Service for Voice-enabled Web-mediated Applications

This paper presents the implementation of the CU VOCAL Web service, one of the first Chinese text-to-speech synthesis Web services. The CU VOCAL Web service can be easily integrated with other Web services to develop innovative Web-mediated applications. We have developed a novel automatic voice alert system in the stocks domain by integrating CU VOCAL and several other Web services. This syste...


Embedded Cantonese TTS for multi-device access to web content

This paper describes the development of an embedded Cantonese text-to-speech synthesizer to enable multi-device access to Chinese Web content. Advancements in wireless communication are driving Web visitors from using desktop PCs to mobile handheld devices. Significant reduction in the form factors of the client devices tends to shift information delivery from the visual to the aural modality. T...


Exploiting unlabeled internal data in conditional random fields to reduce word segmentation errors for Chinese texts

Text-to-speech (TTS) conversion has become widely used in recent years. Chinese TTS faces several unique difficulties. The most critical is caused by the lack of word delimiters in written Chinese. This means that Chinese word segmentation (CWS) must be the first step in Chinese TTS. Unfortunately, due to the ambiguous nature of word boundaries in Chinese, even the best CWS s...


CU VOCAL: corpus-based syllable concatenation for Chinese speech synthesis across domains and dialects

This paper describes CU VOCAL, a Chinese text-to-speech synthesis system that adopts the approach of corpus-based syllable concatenation. We have demonstrated the applicability of the approach primarily for Cantonese, a major dialect of Chinese predominant in Hong Kong, South China and many overseas Chinese communities. This work extends our previous work as described in [1]. Our approach is ab...


Considerations in the usage of text to speech (TTS) in the creation of natural sounding voice enabled web systems

The voice enabled web is a combination of XML based markup languages, speech recognition, text to speech (TTS) and web technologies. Key to the success of voice enabled web applications is the “naturalness” of the interface. Users are much more likely to interact with a system they feel comfortable with and that responds in a human like way. This paper describes the deployment of TTS in commerc...




Publication date: 2003